Multimedia data, including digital image data, is growing rapidly in the
current digital era. As digital image datasets grow, describing images by hand
consumes a great deal of time and makes the image descriptions inconsistent.
A method is therefore needed to process this data, especially to search a
large image dataset for images that are relevant to a query image. One
proposed method for searching information based on image content is content
based image retrieval (CBIR). The main advantage of CBIR over traditional
keyword search is that the retrieval process is automatic. This research
combines the HSV color histogram and the discrete wavelet transform to
extract color and texture features, and uses the chi-square distance to
compare the test images with the images in a database. The results show that
the digital image search system with color and texture features has a
precision of 37.5%-100%, with an average precision of 80.71%, and an accuracy
of 93.7%-100%, with an average accuracy of 98.03%.


1. INTRODUCTION

Many digital devices are now used to capture images, and image processing techniques have made it easy
to store huge numbers of them. Digital images are among the most widely created multimedia data and serve
many needs in the modern era. At the same time, there is a growing need to find useful information in digital
image datasets. Managing this data requires a better system, especially for searching, so that image files can
be found on a computer that holds a large image database. Textual image search, which relies on keywords
assigned to each image, takes a lot of time and makes the image description process inconsistent. Rapidly
accessing these massive collections and retrieving images similar to a given query image presents major
challenges and requires efficient algorithms.

This research uses a later method that indexes image datasets by the features contained in the digital
images themselves, better known as content based image retrieval (CBIR) [1]. Here, CBIR is used to index
digital image datasets based on image color and texture features. Previous CBIR studies have used leaf color
features [2] and color and texture features [3-5]. Another study combined the gradient vector flow snake
(GVFS) method with the CBIR technique to implement a search application [6]. CBIR has also been applied
with the pyramid histogram of oriented gradients (PHOG) approach to extract shape features [7].

Research in this area focuses on reducing the semantic gap between the image feature representation
and human visual understanding. In CBIR and image classification-based models, high-level image visuals
are represented as feature vectors that consist of numerical values [8]. In this study, we design the interface
of a digital image search system. The purpose of this study is to display the images in a dataset that are
relevant to a query image, using a CBIR method based on color and texture features.

Among previous studies on feature extraction in CBIR, Insan Taufik et al. [9] implemented HSV color
detection to extract color features from images. They state that the HSV model can represent any point in the
RGB color model by rearranging the RGB geometry so that it is perceptually more relevant than the Cartesian
coordinate representation. The user controls a color sample and a color tolerance, which serve as the
reference filter for obtaining the right color. HSV color detection is quite effective for detecting colors in
natural color images and tends to be more stable under changes in lighting. In contrast to that work, this
study extracts both texture and color features: a discrete wavelet transform for texture features and HSV
color histograms for color features. Using these two feature extractions together is expected to produce a
better CBIR technique and to help retrieve images that are more accurate and relevant to the user.


2. LITERATURE STUDY 

CBIR is the automatic retrieval of digital images from large databases. CBIR systems identify images
by automatically extracted syntactical features, using the inherent visual contents of an image to perform a
query [10]. Figure 1 shows a typical CBIR system, which automatically extracts the visual attributes (color
and texture) of each image in the database from its pixel values and stores them in a separate database
within the system called the feature database.

In this process, the user formulates a query image and presents it to the system. The system
automatically extracts the visual attributes of the query image in the same way as it does for each database
image, identifies the database images whose feature vectors match those of the query image, and sorts the
best matches by their similarity value. During operation, the system processes compact feature vectors
rather than the large image data, which makes CBIR cheap, fast and efficient compared with text-based
retrieval.

Figure 1. Process involved in content based image retrieval


3. RESEARCH METHOD 
3.1. Feature extraction 
3.1.1. Color histogram 

A color histogram [11] is a representation of the color distribution in an image. For digital images,
the color histogram counts the number of pixels whose colors fall into each of a fixed set of color ranges that
together span the color space of the image. In a CBIR system, a color histogram is an effective approach to
implementing image retrieval [12].

At this stage we use the HSV color space because hue and saturation correspond closely to how the
human visual system perceives color [13-15]. The image is split into its H, S and V channels, and each
channel is quantized to a chosen number of quantization levels. The maximum H, S and V values are
obtained to build the final histogram, which is subsequently normalized [16]. An image in the RGB color
space can easily be transformed to the HSV color space using (1)-(3) [17].





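$$H = \arccos\left( \frac{\tfrac{1}{2}\,(2R - G - B)}{\sqrt{(R-G)^2 + (R-B)(G-B)}} \right) \tag{1}$$

$$S = \frac{\max(R,G,B) - \min(R,G,B)}{\max(R,G,B)} \tag{2}$$

$$V = \max(R,G,B) \tag{3}$$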
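
The quantized, normalized HSV histogram described in this subsection can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the bin counts (8, 3, 3) and the function name `hsv_histogram` are assumptions, and the conversion relies on Python's standard `colorsys.rgb_to_hsv`, which computes hue with the piecewise hexcone formula (scaled to the range 0 to 1) rather than the arccos form of (1), while its S and V agree with (2) and (3).

```python
import colorsys


def hsv_histogram(pixels, bins=(8, 3, 3)):
    """Return a normalized HSV histogram for (R, G, B) pixels in 0..255.

    `bins` gives the number of quantization levels for H, S and V.
    The default (8, 3, 3) is an assumed, illustrative choice.
    """
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        # colorsys expects channel values in [0, 1] and returns h, s, v in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Quantize each channel; min() keeps a value of exactly 1.0 in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist)
    # Normalize so that the bins sum to 1 (an empty input stays all zeros).
    return [count / total for count in hist] if total else hist


# Example: two red pixels, one blue pixel and one black pixel.
example = hsv_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 0)])
```

Normalizing by the pixel count makes histograms of images with different sizes comparable, which matters once the images have been resized in pre-processing.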


3.1.2. Discrete wavelet transforms 

A wavelet is a small wave that is concentrated around a certain point. The Fourier transform deals
only with the frequency components of a signal, while temporal details are not available [12]. The wavelet
transform has wide application in image processing. In this study we apply the 2-D discrete wavelet
transform, whose decomposition processes the rows and the columns separately, as illustrated in Figure 2.

The discrete wavelet transform (DWT) is an implementation of the wavelet transform that uses a
discrete set of wavelet scales and translations obeying some defined rules; in other words, it decomposes
the signal into a mutually orthogonal set of wavelets [18]. According to [19], DWT has stronger resistance
than LWT and is able to produce watermarks with a higher NCC.

In this study, the wavelet decomposition splits each grayscale image region into four sub-images (LL,
HL, LH and HH) [3], called wavelet or DWT sub-bands. Each sub-band is a quarter of the size of the image
before the transformation. The low-frequency sub-band LL produces an image similar to the original, so its
values are also called approximation coefficients. LH, HL and HH are called detail coefficients because they
hold the high-frequency detail of the image [20]. The mean and variation of the four sub-images
corresponding to each region are calculated and concatenated, which gives two vectors that describe the
texture information of the image. Both vectors are normalized to get the texture feature histogram [21].








Figure 2. Level 1 2D wavelet transform (row decomposition followed by column decomposition)
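
The row-then-column decomposition and the sub-band statistics can be illustrated with a short sketch. Several points here are assumptions, since the text does not specify them: the Haar wavelet is used as the simplest choice, "variation" is taken to be the standard deviation, the image dimensions are assumed to be even, and the function names are hypothetical. The final normalization step is omitted.

```python
import math


def haar_dwt2(image):
    """One-level 2-D Haar DWT of a grayscale image (list of rows, even dimensions)."""
    s = math.sqrt(2.0)
    # Row decomposition: pairwise sums give the low-pass half,
    # pairwise differences give the high-pass half.
    low = [[(row[j] + row[j + 1]) / s for j in range(0, len(row), 2)] for row in image]
    high = [[(row[j] - row[j + 1]) / s for j in range(0, len(row), 2)] for row in image]

    def decompose_columns(band):
        width = len(band[0])
        lo = [[(band[i][j] + band[i + 1][j]) / s for j in range(width)]
              for i in range(0, len(band), 2)]
        hi = [[(band[i][j] - band[i + 1][j]) / s for j in range(width)]
              for i in range(0, len(band), 2)]
        return lo, hi

    # Column decomposition of each half yields the four sub-bands.
    # Which detail band is called LH and which HL varies between references.
    ll, lh = decompose_columns(low)
    hl, hh = decompose_columns(high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


def texture_features(image):
    """Return (means, spreads) over the LL, LH, HL and HH sub-bands."""
    bands = haar_dwt2(image)
    means, spreads = [], []
    for name in ("LL", "LH", "HL", "HH"):
        values = [v for row in bands[name] for v in row]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        spreads.append(math.sqrt(variance))  # standard deviation (assumed measure)
    return means, spreads
```

For a perfectly flat image all detail sub-bands are zero and only LL carries energy, which is a quick sanity check on any implementation.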


3.2. Similarity measurement

This research uses a distance-based technique for CBIR. The distance between histogram feature
vectors used for image matching is the chi-square distance, which has been used successfully in face image
analysis [22]. Those experiments indicate that this metric is more accurate than the Euclidean distance on
histogram features [23].

At this stage, the distance between the query image and each database image is calculated. This
distance is a bin-to-bin histogram comparison, which considers the differences between bins as well as their
sizes [24]. Images are then sorted in ascending order of the calculated distance. The chi-square distance is
given by (4):


$$\chi^2(X, Y) = \sum_{i} \frac{(x_i - y_i)^2}{x_i + y_i} \tag{4}$$

where $X$ and $Y$ are two histograms, $x_i$ and $y_i$ are their values in bin $i$, and $i$ is the bin index.
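
A minimal sketch of the chi-square distance and the ascending sort is given below. The small constant `eps`, which avoids division by zero when both bins are empty, and the function names are assumptions added for illustration.

```python
def chi_square_distance(x, y, eps=1e-10):
    """Bin-to-bin chi-square distance between two histograms of equal length.

    `eps` is an assumed guard against division by zero for empty bins.
    """
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(x, y))


def rank_by_distance(query, database):
    """Return image names sorted from the smallest distance to the largest.

    `database` maps an image name to its histogram.
    """
    return sorted(database, key=lambda name: chi_square_distance(query, database[name]))


# Example with three hypothetical two-bin histograms.
ranking = rank_by_distance(
    [0.5, 0.5],
    {"a": [0.5, 0.5], "b": [1.0, 0.0], "c": [0.6, 0.4]},
)
```

Because each squared difference is divided by the combined bin size, a given difference counts for more in sparsely populated bins than in heavily populated ones, unlike the Euclidean distance.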


3.3. Proposed research 

The overall workflow is shown in Figure 3. Our proposed research has the following steps: a) Prepare
the database images, that is, the image data prepared for this research; b) Pre-process the database images
by resizing them and converting their color space; c) Extract color and texture features from the database
images and save them in CSV file format; d) Enter the query image used for searching; e) Extract color and
texture features from the query image; f) Calculate the chi-square distance between the query image and
each database image and sort the results from the smallest distance to the largest; g) Display the sorted
search results on the system interface.


[Flowchart: Start; reference images and query image; preprocessing (resize images and color space
conversion); extract color and texture features (color histogram and discrete wavelet transform);
normalized features; database of features; similarity measurement (chi-square distance); sort results
by relevancy; display results to user; Finish.]

Figure 3. The flow diagram of proposed research
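
Steps c), f) and g) can be sketched as a small storage-and-search routine. This is only an illustration under stated assumptions: the feature vectors are taken as already extracted, the CSV layout (image name followed by feature values), the `top_k` parameter, the image names and all function names are hypothetical, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io


def chi_square(x, y, eps=1e-10):
    # eps is an assumed guard against division by zero for empty bins.
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(x, y))


def save_features(features, fileobj):
    """Step c): write one CSV row per image, the name followed by its feature values."""
    writer = csv.writer(fileobj)
    for name, vector in features.items():
        writer.writerow([name] + list(vector))


def load_features(fileobj):
    """Read the CSV rows back into a mapping of image name to feature vector."""
    return {row[0]: [float(v) for v in row[1:]] for row in csv.reader(fileobj) if row}


def search(query_vector, fileobj, top_k=8):
    """Steps f) and g): score every database image and return the closest ones."""
    database = load_features(fileobj)
    scored = sorted((chi_square(query_vector, vec), name) for name, vec in database.items())
    return [(name, distance) for distance, name in scored[:top_k]]


# Usage with an in-memory buffer standing in for the CSV file.
buffer = io.StringIO(newline="")
save_features(
    {"boat_01": [0.7, 0.2, 0.1], "temple_01": [0.1, 0.2, 0.7], "boat_02": [0.6, 0.3, 0.1]},
    buffer,
)
buffer.seek(0)
results = search([0.7, 0.2, 0.1], buffer, top_k=2)
```

Storing the features once means that a query only requires extracting features from the query image and comparing vectors, not reprocessing every database image.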


4. RESULTS AND DISCUSSION 
4.1. Datasets and performance measurement 

This study uses the INRIA Holidays dataset. For our experimental analysis, 200 images in 20
categories are taken from the INRIA Holidays database, with 10 images in each category. The image
categories include boat, temple, mountain, flower, fruit and house. The performance of the system is
evaluated using precision and recall. Precision is the fraction of retrieved images that are relevant [25], as
shown in (5).

$$\text{Precision} = \frac{\text{Number of relevant retrieved images}}{\text{Total number of retrieved images}} \tag{5}$$


Recall is the fraction of all relevant images in the image database that are retrieved [25], as shown
in (6).

$$\text{Recall} = \frac{\text{Number of relevant retrieved images}}{\text{Total number of relevant images in database}} \tag{6}$$
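
The two measures can be computed from the set of retrieved images and the set of relevant images, as in the sketch below. The function name and the image identifiers are hypothetical, and the numbers in the example are purely illustrative.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    `retrieved` holds the image names returned by the system and
    `relevant` holds all images in the database relevant to the query.
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant retrieved images
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Illustrative query: 6 images retrieved, 5 of them relevant,
# with 8 relevant images in the database overall.
relevant = ["img%d" % i for i in range(8)]
retrieved = ["img0", "img1", "img2", "img3", "img4", "other"]
precision, recall = precision_recall(retrieved, relevant)
```

Here precision is 5/6 (about 83.3%) and recall is 5/8 (62.5%), which shows how the two values differ whenever the number of retrieved images is not equal to the number of relevant images.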
4.2. Experimental results 

Our proposed method, which uses a feature extraction technique based on color and texture, is
applied to 200 images of the INRIA image database. In the evaluation stage, two pictures are taken randomly
from each category, and the precision and recall are computed for each of them, as shown in Figure 4(a).
Figure 4(b) then shows the average precision and average recall for each measured category.

Figure 4. INRIA image database, (a) Snapshot of the data set, (b) Retrieved result with an input test image


In this study, the digital image search system using the CBIR method is tested in three stages. Table 1
shows the CBIR system experiment using texture features, Table 2 shows the experiment using color
features, and Table 3 shows the experiment using combined color and texture features.

In testing with the texture features (Table 1), the best result is a precision of 83.3% for the
'mountain' and 'sunset' image categories, with an average precision of 40.99%, recall of 32.5%, and
accuracy of 94.27%. With the color features (Table 2), precision reaches 100% in several image categories,
such as 'boat', 'cone' and 'temple', with an average precision and recall of 73.12% and an accuracy of
97.28%. The test results in Table 3 show that the combination of color and texture features reaches 100%
precision in more image categories than either the color features or the texture features alone. It achieves
an average precision of 80.71%, recall of 80%, and accuracy of 98.03%. These results show that the scheme
with color features performs better than the scheme with texture features, and that the scheme with
combined color and texture features provides the best image retrieval performance of the three.


Table 1. The performance of CBIR based on texture feature

Query Image    Precision (%)    Recall (%)    Accuracy (%)
Sunrise        50               50            95
Mountain       83.3             62.5          97.5
Lagoon         28.5             25            93.1
Valley         0                0             91.8
Harbor         40               25            94.3
Boat           37.5             37.5          93.7
Petra          33.3             12.5          94.3
Pyramid        66.6             50            96.2
House          14.2             12.5          91.8
Coral          80               50            96.8
Fruit          42.8             37.5          94.3
Flower         50               50            95
Cone           25               12.5          93.7
Temple         28.5             25            93.1
Paint          14.2             12.5          91.8
Settlement     16.6             12.5          92.5
Canoe          50               50            95
Sunset         83.3             62.5          97.5
Garden         33.3             25            93.7
Venice         42.8             37.5          94.3
Average        40.99            32.5          94.27


Table 2. The performance of CBIR based on color feature

Query Image    Precision (%)    Recall (%)    Accuracy (%)
Sunrise        75               75            97.5
Mountain       87.5             87.5          98.7
Lagoon         62.5             62.5          96.2
Valley         37.5             37.5          93.7
Harbor         87.5             87.5          98.7
Boat           100              100           100
Petra          37.5             37.5          93.7
Pyramid        87.5             87.5          98.7
House          75               75            97.5
Coral          87.5             87.5          98.7
Fruit          37.5             37.5          93.7
Flower         37.5             37.5          93.7
Cone           100              100           100
Temple         100              100           100
Paint          87.5             87.5          98.7
Settlement     87.5             87.5          98.7
Canoe          87.5             87.5          98.7
Sunset         75               75            97.5
Garden         87.5             87.5          98.7
Venice         25               25            92.5
Average        73.12            73.12         97.28








Table 3. The performance of CBIR based on color and texture features

Query Image    Precision (%)    Recall (%)    Accuracy (%)
Sunrise        87.5             87.5          98.7
Mountain       87.5             87.5          98.7
Lagoon         62.5             62.5          96.2
Valley         42.8             37.5          94.3
Harbor         87.5             87.5          98.7
Boat           100              100           100
Petra          37.5             37.5          93.7
Pyramid        100              100           100
House          87.5             87.5          98.7
Coral          100              100           100
Fruit          71.4             62.5          96.8
Flower         75               75            97.5
Cone           100              100           100
Temple         100              100           100
Paint          87.5             87.5          98.7
Settlement     87.5             87.5          98.7
Canoe          87.5             87.5          98.7
Sunset         75               75            97.5
Garden         87.5             87.5          98.7
Venice         50               50            95
Average        80.71            80            98.03





5. CONCLUSION 

In this paper, discrete wavelet transforms are added to the feature extraction of previous studies, and
the chi-square distance is used for similarity measurement, to make the research results more accurate.
Based on the results obtained, it can be concluded that the CBIR method has been successfully applied to a
digital image search system with image feature extraction on a dataset of 200 images. Image retrieval is
simulated 20 times with different queries, which are taken randomly. Retrieval performance based on color
and texture is measured with precision, recall, and accuracy, which generally show that color features
perform better than texture features. Combining two or more features provides better results than a single
feature. Color and texture features together form more precise feature vectors, whose similarity can be
matched with the chi-square distance.